HA System Fault Alarm

This alarm is applicable only to products supporting HA (Mediant 500, Mediant 800, Mediant 2600, Mediant 4000, Mediant 9000, and Mediant Software).

acHASystemFaultAlarm

Alarm

acHASystemFaultAlarm

OID

1.3.6.1.4.1.5003.9.10.1.21.2.0.33

Description

The alarm is sent when the High Availability (HA) system is faulty (i.e., no HA functionality).

Default Severity

Critical

Source Varbind Text

System#0/Module#<m>, where m is the blade module’s slot number

Event Type

qualityOfServiceAlarm

Probable Cause

outOfService
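When processing this alarm in a management script, the blade module's slot number can be read from the source varbind text. The following is a minimal sketch, not part of the product: the function name `parse_module_slot` is hypothetical, and it assumes that `<m>` is a non-negative decimal integer.

```python
import re

# Source varbind text as documented for this alarm: "System#0/Module#<m>".
# Assumption: <m> is a non-negative decimal integer.
_SOURCE_RE = re.compile(r"^System#0/Module#(\d+)$")


def parse_module_slot(source: str) -> int:
    """Return the module slot number <m> from the source varbind text.

    Hypothetical helper for illustration only.
    """
    match = _SOURCE_RE.match(source.strip())
    if match is None:
        raise ValueError(f"unexpected source varbind text: {source!r}")
    return int(match.group(1))
```

For example, `parse_module_slot("System#0/Module#1")` returns the integer 1, and any text that doesn't match the documented form raises `ValueError`.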

Severity

Condition

Text

Corrective Action

Critical

HA has failed to initialize because of a configuration error.

"SYS_HA: HA Remote address not configured, No HA system."

Configure a valid 'HA Remote Address'.

"SYS_HA: HA Remote address and Maintenance IF address are not on the same subnet, No HA system."

Configure a valid Maintenance network interface and 'HA Remote Address'.

"SYS_HA: HA Remote address and Maintenance IF address should be different, No HA system."

Configure a valid Maintenance network interface and 'HA Remote Address'.

HA is active, but the system is not operating in HA mode.

"Switch-Over: Reason = Fatal exception error"

HA was lost because of a switchover and should return automatically after a few minutes. Corrective action isn't required.

"Switch-Over: Reason = SW WD exception error"

HA was lost because of a switchover and should return automatically after a few minutes. Corrective action isn't required.

"Switch-Over: Reason = System error"

HA was lost because of a switchover caused by a general system error and should return automatically after a few minutes. Corrective action isn't required.

"Switch-Over: Reason = Eth link error"

HA was lost because of a switchover. Reconnect the Ethernet link.

"Switch-Over: Reason = Network Monitor error. Failed table rows index: <id 1> … up to <id 10>"

HA was lost because of a switchover caused by the HA Network Monitor feature as the threshold of unreachable rows (in the HA Network Monitor table) was exceeded. The indices of these unreachable rows are provided in the alarm's text. The HA mode should return automatically after a few minutes. Corrective action isn't required.

"Switch-Over: Reason = Keep Alive error"

HA was lost because of a switchover and should return automatically after a few minutes. Corrective action isn't required.

"Switch-Over: Reason = DSP error"

HA was lost because of a switchover and should return automatically after a few minutes. Corrective action isn't required.

Note: Applicable only to Mediant 4000.

"Switch-Over: Reason = Software upgrade"

HA was lost because of a switchover and should return automatically after a few minutes. Corrective action isn't required.

"Switch-Over: Reason = Software upgrade - switch back"

HA was lost because of a switchover caused by the Hitless Software Upgrade process that switched from active to redundant device, and should return automatically. Corrective action isn't required.

"Switch-Over: Reason = Fk upgrade"

HA was lost because of a switchover caused by a Hitless License Upgrade process and should return automatically after a few minutes. Corrective action isn't required.

"Switch-Over: Reason = Manual switch over"

HA was lost because of a switchover and should return automatically after a few minutes. Corrective action isn't required.

"Switch-Over: Reason = Higher HA priority"

HA was lost because of a switchover to the device with the higher HA priority and should return automatically after a few minutes. Corrective action isn't required.

Major

HA feature is active, but the system is not operating in HA mode.

"SYS_HA: Invalid Network configuration, fix it and reboot Redundant unit - no HA system!"

HA synchronization process failed. Correct invalid network configuration and then restart the Redundant device to trigger HA synchronization again.

"SYS_HA: Offline configuration was changed, HA is not available until next system reboot."

HA synchronization process failed. A configuration change that requires a device restart was made; restart the device to apply the new configuration before the standalone system can return to HA mode.

"SYS_HA: Redundant is not reconnecting after deliberate restart, No HA system."

HA synchronization process failed. Manually restart the Redundant device.

The system is no longer in HA mode because the redundant device is restarting or disconnected from the active device. For example, this can occur during a hitless software upgrade when the redundant device burns the new firmware and then restarts to apply it.

"HA is not operational: redundant unit error/reset reason - <fault description, e.g., Software Upgrade>."

-

The redundant device disconnected from the HA system and the active device is now in standalone mode.

"HA is not operational: Redundant unit is disconnected."

-

The active device is in standalone mode and then the redundant device joins HA and synchronizes with the active device.

"HA is not operational: synchronizing redundant unit's state and configuration."

-

The active device is in standalone mode and then the redundant device joins HA, but they are running different software versions (.cmp). Therefore, the redundant device gets the .cmp file from the active device (so that they run the same software version).

"HA is not operational: updating redundant unit's software version."

-

An offline parameter (i.e., one that requires a device restart) is modified on the active device. An HA switchover occurs, the redundant device (previously the active device) restarts to apply the new settings, and synchronization between the active and redundant devices occurs.

"HA is not operational: redundant unit is restarting to apply new configuration."

-

Minor

The HA Network Monitor feature doesn't trigger an HA switchover because the 'Preempt Mode' parameter is configured to Enable and the 'Preempt Priority' parameter is configured to a priority level.

"Network Monitor switch-over is blocked when HA Preemptive mode and Priority is configured"

-

The HA Network Monitor feature doesn't trigger an HA switchover because the number of Ethernet Groups (Ethernet links) in "up" status on the redundant device is less than on the active device.

"Network Monitor switch-over is blocked when status of Ethernet links on redundant is worse than on active unit"

-

The Maintenance Events Monitoring feature is enabled (MaintenanceEventsMonitoringEnable) and the cloud platform performs a maintenance event on the virtual machine hosting the active device, causing an HA switchover to the redundant device.

Note: This condition is applicable only to Mediant VE SBC, and only when its [MaintenanceEventsMonitoringEnable] parameter is enabled and its [MaintenanceEventsTreatmentEnable] parameter is disabled.

"HA is not operational: switch-over from Active to Redundant unit, Switch over reason - VM maintenance event"

-

The Ethernet Group associated with the Maintenance IP interface (used for HA systems) is configured with two ports, but one of them is down (i.e., no 1+1 Ethernet port redundancy).

"SYS_HA: Maintenance redundant link is down - no HA maintenance link redundancy"

Make sure that the network cable is firmly plugged into the Ethernet port.
Make sure that the other end of the network cable is correctly connected to the network.

Cleared

The HA system is active and operational.

"HA is operational"

-
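The three configuration errors listed under the Critical severity (HA Remote address not configured, not on the same subnet as the Maintenance interface, or identical to the Maintenance interface address) can be checked offline before the settings are applied. The following is a minimal sketch using Python's standard ipaddress module. It is not the device's own validation logic: the function name `check_ha_addresses`, its input format (Maintenance interface in CIDR notation), and the returned problem strings are all assumptions made for illustration.

```python
import ipaddress
from typing import List, Optional


def check_ha_addresses(maintenance_if: str, ha_remote: Optional[str]) -> List[str]:
    """Return a list of problems found; an empty list means no problem found.

    maintenance_if: Maintenance interface in CIDR form, e.g. "10.0.0.1/24"
                    (assumed input format).
    ha_remote:      'HA Remote Address' value, or None/"" if not configured.
    """
    if not ha_remote:
        return ["HA Remote address not configured"]

    iface = ipaddress.ip_interface(maintenance_if)
    remote = ipaddress.ip_address(ha_remote)

    problems = []
    # Both addresses must belong to the same subnet.
    if remote not in iface.network:
        problems.append("not on the same subnet")
    # The two addresses must differ.
    if remote == iface.ip:
        problems.append("addresses should be different")
    return problems
```

For example, `check_ha_addresses("10.0.0.1/24", "10.0.0.2")` returns an empty list, whereas passing "10.0.1.2" as the second argument reports that the addresses are not on the same subnet.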